Abstract


Curve fitting is a computational problem in which we look for an underlying function that fits a set of data points. Recently,
nonparametric regression has received a lot of attention from researchers. Usually, spline functions are used due to the
difficulty of the curve fitting. In this regard, the choice of the number and location of knots for regression is a major
issue. Therefore, in this study, a Genetic Algorithm (GA) simultaneously determines the number and location of the
knots based on two criteria: the least square error and the capability process index. The performance of the proposed
algorithm has been evaluated by some numerical examples. Simulation results and comparisons reveal that the
proposed approach has satisfactory performance in curve fitting. Also, an example illustrates a sensitivity analysis of the
number of knots. Finally, simulation results from a real case in Statistical Process Control (SPC) show that the proposed
GA works well in practice.


Keywords: Capability process index, Genetic algorithm, Least square error, Spline regression. 


1 | Introduction 


With the development of technology in computation and measurement, scientists usually encounter

As mentioned in Dierckx [1], the distribution of knots is a nonlinear optimization problem. To solve
these problems, as one of the first works, Dimatteo et al. [2] used a Bayesian model with Markov Chain
Monte Carlo (MCMC) in spline regression. Some studies carried out computations based on
nonlinear optimization, such as Ahmed et al. [3]. Also, Zhao et al. [4] proposed an adaptive knot
placement using a generalized mix model-based continuous optimization algorithm based on a B-Spline
curve approximation. Gazioglu et al. [5] developed a penalized regression spline methodology which uses
all the data and improves the precision of estimation. Also, Lai and Wang [6] studied the asymptotic
behavior of penalized spline estimators using bivariate splines over triangulations and an energy
functional as the penalty. After that, Seo et al. [7] proposed an outlier detection method in penalized
spline regression models.


Furthermore, Schwarz and Krivobokova [8] developed a unified framework to investigate the properties
of all periodic spline-based estimators, including regression, penalized, and smoothing splines. 
Moreover, Montoril [9] suggested a spline estimation of the functional-coefficient in regression models 
for time series with correlated errors. In addition, for time series nonparametric regression models with 
discontinuities, Yang and Song [10] used polynomial splines to estimate locations and sizes of jumps in 
the mean function. Papp and Alizadeh [11] applied a shape-constrained estimation using nonnegative 
splines. Moreover, Ma et al. [12] proposed a new method in spline regression in the presence of 
categorical predictors. In recent years, Zhou et al. [13] proposed the polynomial spline method to 
estimate a partial functional linear model. Recently, Daouia et al. [14] developed a novel constrained 
approach to the boundary curve achieved from the smoothness of spline approximation. 


Usually, in most of these techniques, the spline coefficients are estimated first, and then the knots are
selected. These methods display relatively satisfactory performance; however, they are statistically
complicated, and the results sometimes fall into local optima. Hence, some researchers have employed
omission and addition techniques to estimate the knots. In this regard, Powell [15] produced extra knots 
in one variable, and Jupp [16] required an initial estimation of the location of the knots that is not feasible 
in practice. Similarly, Dierckx [1] needed an error tolerance or a smoothing factor to estimate the location 
of the knots at first. Ma [17] obtained a plug-in formula for the optimal number of interior knots based 
on the theoretical results of asymptotic optimality and strategies for choosing them in the spline 
estimator. Wang [18] treated the number and locations of knots as free parameters and used reversible 
jump MCMC to obtain posterior samples of knot configurations. In this work, second-order 
programming is used to estimate the remaining parameters based on the number and location of the 
knots. 


On the other hand, metaheuristic algorithms are computational intelligence paradigms used especially
for solving sophisticated optimization problems. For example, Engin and Isler [19] proposed a parallel
greedy algorithm to solve the fuzzy hybrid flow shop problems with setup time and lot size. Also, Goli 
et al. [20] proposed a comprehensive model of demand prediction based on hybrid artificial intelligence 
and metaheuristic algorithms in the dairy industry. Moreover, Shahsavari et al. [21] suggested a novel 
GA for a flow shop scheduling problem with fuzzy processing time. On this subject, Sanagooy Aghdam 
et al. [22] proposed a heuristic method of GA and Simulated Annealing (SA) for the purpose of placing 
readers in an emergency department of a hospital. Recently, Khalili and Mosadegh Khah [23] presented 
a new mathematical optimization model using queuing theory to determine the hotel capacity in an 
optimal manner. On this subject, Rezaee and Pilevari [24] presented a mathematical model of sustainable 
multilevel supply chain using a meta-heuristic algorithm approach. Also, Alizadeh Firozi et al. [25] studied
an uncapacitated single allocation hub location problem.


In the field of application of GA in curve fitting, Irshad et al. [26] proposed a technique to capture the 
outline of planar objects based on two rational cubic functions for approximating the boundary curve 
using GA. It is worth mentioning that the GA, first introduced by Holland [27], is an approach for an optimal
selection of the number and location of the knots. Afterward, Lee [28]
changed the search space of the GA to a discontinuous type and used one-hot encoding to represent the
knots. In addition, Yoshimoto et al. [29] proposed a coded GA in curve fitting. In this regard, Pittman
[30] represented the knots by integer coding. In his method, the number of knots is assumed to be fixed
and only their location must be optimized. In addition, Tongur and Ulker [31] estimated the knot points
of curves by using a Niched Pareto GA. Garcia et al. [32] proposed a hierarchical GA to
counter the B-Spline curve interpolation problem. Their proposed approach helps identify the number and 
location of the knots, and it is capable of simultaneous determination of coefficients of the B-Spline 
function. Galvez et al. [33] introduced an adapted elitist clonal selection algorithm for automatic knot 
adjustment of B-Spline curves that determines the number and location of knots to obtain accurate data. 
Garcia et al. [32] applied a hierarchical GA to tackle the B-Spline curve-fitting problem. Fengler and Hin 
[34] proposed a simple and general approach to fitting the discount curve under no-arbitrage constraints 
based on a penalized shape-constrained B-spline. Liu et al. [35] suggested jump-detection and curve 
estimation methods for the discontinuous regression function. Wu et al. [36] surveyed the problem of 
fitting scattered data points with ball B-Spline curves and then proposed a corresponding fitting algorithm 
based on the particle swarm optimization algorithm. 


On this subject, Karadede and Ozdemir [37] suggested a hierarchical soft computing model for estimating 
the parameter of curve fitting problems consisting of three stages. Afterward, Ramirez et al. [38] applied a 
parallel hierarchical Genetic Algorithm (GA) and B-splines to solve the curve-fitting problem of noisy 
scattered data using a multi-objective function. Recently, Li and Lily [39] proposed an approach based on 
an extreme learning machine algorithm to solve nonlinear curve fitting problems. Also, Yeh et al. [40] 
provided a new algorithm for curve fitting by a B-Spline of arbitrary order to determine the knot vector. 
They utilized a feature function that describes the amount and spatial distribution of the input curve. 


Generally, the distribution of knots in splines is a nonlinear optimization problem. To solve this problem,
researchers have used methods such as the Bayesian model, MCMC, a generalized mix model-based
continuous optimization algorithm, penalized regression splines, and the constrained approach to the
boundary curve. Recently, some works have used the GA to counter the spline curve interpolation
problem. Choosing the number and location of the knots is an important challenge in data interpolation
through spline regression. As mentioned above, a lot of attention has been given to estimating the number
and location of the knots. Hence, in this study, a new GA is employed based on three approaches:
Least Squares Error (LSE), Capability Process Index (CPI), and a combination of these two
functions for curve fitting. In the rest of the paper, the proposed GA is first discussed in detail.


In the third section, the performance of the proposed approaches is evaluated, and the three estimation
methods of the proposed algorithm are compared. Afterward, a sensitivity analysis of the number of knots
is illustrated by an example. Section 4 compares the proposed method with an existing GA, and Section 5
shows its applicability to change point estimation in monitoring the sales curve of cooling equipment.
Finally, the conclusion and directions for further research are given in the last section.

2 | Proposed Method 


Consider a vector $x = [x_1, \ldots, x_n]$ fitted within different intervals with different response variables
$y = [y_1, \ldots, y_n]$. This paper focuses on the estimation of the regression parameters. Assume that
$\Delta = \{a = x_0 \le x_1 < \cdots < x_k = b\}$ is a partition of the interval $I = [a, b]$ in which the distances
between points are not necessarily equal. A spline is a function that is constructed piecewise from polynomial
functions. To provide a visual interpretation, a schematic concept of spline regression is shown below:


Fig. 1. A demonstration of spline regression for curve fitting. 


“In general, the setup of fixed knots is an arbitrary restriction of the set of available spline curves” [29].
Therefore, the problem is first considered with fixed knots. It is worth mentioning that the
proposed method is a flexible model for normal data in which residuals are normally distributed. To
construct the chromosomes representing the knot vector, the approach of Haupt and Haupt [41] is applied.
In the GA scheme, a population of chromosomes is first produced randomly, in which each chromosome is a
candidate solution for the knot vector. After that, the selected chromosomes in the initial population are
replaced with new chromosomes obtained from the mutation and crossover operators. We estimate the regression
parameters in each interval of x. Then, for the population, the objective function is calculated using three
approaches, namely the least squared error, the CPI, and the combination of them, which are explained
briefly in the following. Then, a certain number of parent chromosomes are selected from the initial
population. The selection and replacement processes continue until the algorithm terminates.


The coefficients of the spline functions can be estimated by least squares regression. This method is
used when the distribution is exclusively normal. In this method, for a certain value of x, the
best-predicted value of y is $f(x)$:

$$y = f(x) + \text{noise}, \qquad (1)$$

in which the function $f$ is called the regression function. The parameters are estimated by
minimizing the sum of $(f(x) - y)^2$ over all observations. It is also important to verify, in a residual
analysis, that the white-noise assumptions on the residuals are satisfied; if they are, we can be confident
that the model is well fitted.
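
The least-squares step can be illustrated with a minimal sketch. It assumes, purely for illustration, that a straight line is fitted independently on each interval of a given knot vector; the function name `fit_segments`, the half-open interval convention, and the toy data are hypothetical and are not taken from the original study.

```python
import numpy as np

def fit_segments(x, y, knots):
    """Fit y = b0 + b1*x by least squares on each interval between knots.

    Returns the fitted values and the sum of squared errors (SSE).
    Independent straight lines per interval are an illustrative assumption,
    not necessarily the spline form used in the paper.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Interval edges: the data range plus the interior knots, sorted and unique.
    edges = np.unique(np.concatenate(([x.min()], knots, [x.max()])))
    fitted = np.zeros_like(y)
    for k in range(len(edges) - 1):
        lo, hi = edges[k], edges[k + 1]
        last = k == len(edges) - 2
        # Half-open intervals [lo, hi); the last one is closed on the right.
        mask = (x >= lo) & ((x <= hi) if last else (x < hi))
        if not mask.any():
            continue
        design = np.column_stack([np.ones(mask.sum()), x[mask]])
        coef, *_ = np.linalg.lstsq(design, y[mask], rcond=None)
        fitted[mask] = design @ coef
    sse = float(np.sum((fitted - y) ** 2))
    return fitted, sse

# Toy data: two straight lines that change at x = 5.
x = np.arange(10.0)
y = np.where(x < 5, 2 * x, 20 - x)
_, sse_with_knot = fit_segments(x, y, knots=[5])
_, sse_no_knot = fit_segments(x, y, knots=[])
print(sse_with_knot, sse_no_knot)
```

On this toy data, placing the knot at the true change location drives the SSE to essentially zero, whereas a single line over the whole range leaves a clearly larger error. This is the quantity a knot-search procedure would try to reduce.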


On the other hand, the detection and examination of outliers are important parts of data analysis
because some outliers in the data may have a detrimental effect on statistical analysis. Many authors have
discussed outlier detection methods. In this regard, we utilized the CPI method. In this method, the
estimated regression parameters are obtained by applying the CPI to the residuals. In other words, using this
index, the output of the controlled process is compared with the quality specification limits.
The comparison is made by the ratio of the width of the specification limits to six times the standard
deviation of the residuals. Several statistics can be used to measure the capability of a
process: Cp, Cpk, and Cpm. For the sake of simplicity, the Cp index is used as one of the objective
functions of the GA. Assume that there is a two-sided specification, and USL and LSL are the upper and
lower specification limits, respectively. In Statistical Process Control (SPC), Cp is calculated as
follows:

$$C_p = \frac{USL - LSL}{6\sigma}. \qquad (2)$$
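
As a small worked example of the Cp index applied to residuals, the sketch below computes the ratio of the specification width to six standard deviations. The function name `cp_index`, the use of the sample standard deviation (`ddof=1`), and the specification limits are assumptions made for illustration only.

```python
import numpy as np

def cp_index(residuals, usl, lsl):
    """Cp = (USL - LSL) / (6 * sigma), with sigma estimated from the residuals."""
    # Sample standard deviation (ddof=1) is an assumed choice of estimator.
    sigma = np.std(np.asarray(residuals, dtype=float), ddof=1)
    if sigma == 0:
        return float("inf")  # no spread at all: capability is unbounded
    return float((usl - lsl) / (6.0 * sigma))

residuals = [-1.0, 0.0, 1.0]  # sample standard deviation is exactly 1
cp = cp_index(residuals, usl=3.0, lsl=-3.0)
print(cp)  # 1.0
```

A larger Cp means the residuals are tightly concentrated relative to the specification limits, which is why a GA using this criterion would seek to maximize it.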


As can be seen in the simulation studies, applying the two mentioned objective functions separately leads
to satisfactory results. It is worth mentioning that considering these approaches simultaneously gives
more desirable answers than using them separately. Applying the least-squares criterion may remove a chromosome
that has only a few outliers. On the other hand, a GA based on the CPI objective function may give an
answer in which all response variables are near the control limit, while the total error is not negligible.
According to this concept and some simulation results, it can be concluded that the two
approaches act against each other. Hence, combining them with a weighting method leads to a better
solution. The proposed method based on the two estimation functions is as follows:

$$\alpha \sum (f(x) - y)^2 - \beta \left( \frac{USL - LSL}{6\sigma} \right), \qquad (3)$$

where the two parameters $\alpha$ and $\beta$ are the weights for the two objective functions, the least-squares
criterion and the CPI, respectively.
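
The weighted criterion can be sketched numerically as follows. This assumes the combined objective takes the form of a weighted squared-error term minus a weighted Cp term, to be minimized; the function name `combined_objective`, the weights, and the specification limits are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def combined_objective(residuals, usl, lsl, alpha, beta):
    """Weighted objective alpha * SSE - beta * Cp (smaller is better)."""
    r = np.asarray(residuals, dtype=float)
    sse = np.sum(r ** 2)
    # Cp computed with the sample standard deviation (an assumed estimator).
    cp = (usl - lsl) / (6.0 * np.std(r, ddof=1))
    return float(alpha * sse - beta * cp)

residuals = [-1.0, 0.0, 1.0]  # SSE = 2, sample sigma = 1, Cp = 1
value = combined_objective(residuals, usl=3.0, lsl=-3.0, alpha=0.5, beta=0.5)
print(value)  # 0.5 * 2 - 0.5 * 1 = 0.5
```

The subtraction reflects that the squared error is to be made small while the capability index is to be made large, so both goals are served by minimizing a single number.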
2.1 | Proposed GA Pseudo-Code to Estimate the Spline Regression 


Succinctly, the recommended pseudo-code of the GA algorithm is given in the following. 


I. Determine the input parameters.
II. Produce the primary population from 0 and 1 values.
III. Transform the 0 and 1 values to the D vector, or knot vector (the D vector is defined clearly in [34]).
IV. Estimate the regression parameters using LSE for each interval of the D vector of the population.
V. Calculate the fitting function according to Eq. (3).
VI. Compare the minimum function value, as the best answer, with the best answer in the previous replication.
VII. Create a new population according to the following method:


- Select two D vectors from the current population using the selection operator with the occurrence
probability of pc.

- Transform the selected D vectors to 0 and 1 vectors and apply the crossover operator to create new vectors.

- Select a D vector using the selection operator with the occurrence probability of pm.

- Transform the selected D vector to a 0 and 1 vector and apply the mutation operator to the new vector.

- Add the new vectors to the population, together with the superior pr fraction of the initial population.
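
The steps above can be sketched as a compact GA loop. This is a minimal illustration under several assumptions that are not confirmed by the source: each gene is read as a 7-bit binary integer clipped to the range 1 to 100, selection is a binary tournament, crossover is single-point, and the toy cost function simply measures the distance to a hypothetical target knot vector instead of evaluating a fitted spline.

```python
import random

BITS = 7  # bits per gene; assumed to match the "7 particles per gene" setting

def decode(chromosome, x_max=100):
    """Map a flat 0/1 list to a sorted knot vector, one knot per gene."""
    knots = []
    for i in range(0, len(chromosome), BITS):
        value = int("".join(str(b) for b in chromosome[i:i + BITS]), 2)
        knots.append(min(max(value, 1), x_max))  # clip into [1, x_max]
    return sorted(knots)

def crossover(a, b, rng):
    """Single-point crossover of two equal-length bit lists."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(chromosome, rng, rate=0.05):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in chromosome]

def run_ga(cost, n_genes=10, pop_size=100, generations=50,
           pc=0.65, pm=0.2, pr=0.05, seed=0):
    """Minimise cost(knots) over binary-coded knot vectors."""
    rng = random.Random(seed)
    length = n_genes * BITS
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]

    def score(chromosome):
        return cost(decode(chromosome))

    for _ in range(generations):
        ranked = sorted(pop, key=score)
        # Elitism: carry the best pr fraction over unchanged.
        new_pop = ranked[:max(1, int(pr * pop_size))]
        while len(new_pop) < pop_size:
            # Binary tournament selection (an assumed selection operator).
            p1 = min(rng.sample(pop, 2), key=score)
            p2 = min(rng.sample(pop, 2), key=score)
            if rng.random() < pc:
                p1, p2 = crossover(p1, p2, rng)
            if rng.random() < pm:
                p1 = mutate(p1, rng)
            new_pop.extend([p1, p2])
        pop = new_pop[:pop_size]
    return decode(min(pop, key=score))

# Toy cost: distance to a hypothetical target knot vector.
target = [17, 25, 35, 43, 55, 60, 70, 87, 90, 100]

def toy_cost(knots):
    return sum(abs(k - t) for k, t in zip(knots, target))

print(run_ga(toy_cost, generations=30))
```

Because the elite individuals are copied unchanged into each new population, the best cost can never get worse from one generation to the next. In a real application, `toy_cost` would be replaced by a function that fits the regression on each interval and returns the chosen criterion.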


3 | Performance Evaluation of the Proposed Method 


In this section, we present some simulation results to evaluate the performance of the proposed
method. In this regard, we use some multiple linear regression examples. The parameters of the proposed GA,
such as the crossover and mutation percentages, must be set to enhance its efficiency.
To achieve this goal, the trial and error method is employed to adjust the parameters. The
crossover percentage is set to 65 percent and the mutation percentage to 20 percent
(pc=0.65, pm=0.2). Similarly, the rate of selecting superior answers from the initial
population is 5 percent (pr=0.05). The population consists of 100 chromosomes. The number of genes in each
chromosome and the number of particles in each gene are equal to 20 and 7, respectively. In addition, the
number of replications in each run is equal to 3000. The assumed functions with determined knots are shown
in Table 1. First, we minimize the total error based on only the least squared error. Afterward, maximizing
the CPI is similarly taken as the objective function. Then, the combination of the two functions is
considered. It should be noted that we use the Mean Squared Error (MSE) criterion to appraise and compare
the proposed methods with different objective functions.


Table 1. The assumed functions in each interval created by knots. 


Number   Function
1        Y1 = x1 + 3x2 + 5x3 + 8x4
2        Y2 = x1 + 8x2 + 7x3 + 3x4
3        Y3 = 2x1 + 4x2 + 6x3 + 6x4
4        Y4 = 2x1 + 3x2 + 1x3 + 9x4
5        Y5 = 2x1 + 5x2 + 7x3 + 3x4
6        Y6 = 2x1 + 3x2 + 6x3 + 5x4
7        Y7 = x1 + 4x2 + 2x3 + 8x4
8        Y8 = x1 + 2x2 + 7x3 + 2x4
9        Y9 = 4x1 + 8x2 + 9x3 + 8x4
10       Y10 = x1 + 3x2 + 6x3 + 6x4

Knot vector = [17, 36, 43, 43, 54, 58, 62, 69, 93, 100]


Objective functions based on the LSE and the CPI separately may give satisfactory results. However,
they may not achieve the appropriate fitting because, as expected, the emphasis in the least squared
error is on the total deviations. In some cases, we may witness excessive error at a few points
while the fitting has been well achieved at the majority of points. In this situation, the LSE
approach is not able to select an appropriate model for curve fitting. On the contrary, when the CPI is
used, there may be considerable errors in the majority of points with fewer outliers.
As a result, combining the two functions and employing them simultaneously is an acceptable way to achieve
a more appropriate fit and a lower MSE. As mentioned before, the assumed knot vector is
given in Table 1. Also, the knot vector obtained from the proposed method is reported as the
estimated knot vector in 11 examples in Tables 2, 3, and 4.


In Tables 2 and 3, an example has been provided with a determined input variable in each run to obtain
the optimal solution for one of the objective functions (CPI and LSE, respectively), and the other
objective function is calculated with the optimal population. In this regard, the final population is
taken as the population of the other objective function and the algorithm is run once again. Moreover,
the three approaches, namely the objective functions based on the LSE, the CPI, and the combination of
them, are applied to the mentioned method in Table 4.


Comparing the results in Tables 2, 3, and 4 shows that the obtained vector of the solution is close to the 
initial knot vector. We can confirm that considering the two objective functions simultaneously
substantially improves the performance of the algorithm. In summary, the simulation results indicate that the
proposed method is capable of handling behaviors of a wide range of observations on the sub-intervals. 


Table 2. Superior performance of CPI approach to the LSE approach. 


Example  Objective function          MSE      Knot points
1        CPI function                58.70    [17, 36, 43, 43, 54, 58, 62, 69, 93, 100]
1        Corresponding LSE function  69.90    [17, 25, 43, 55, 60, 70, 78, 87, 91, 100]
2        CPI function                299.70   [17, 26, 58, 65, 67, 89, 97, 100, 100, 100]
2        Corresponding LSE function  305.40   [15, 25, 59, 68, 69, 82, 100, 100, 100, 100]
3        CPI function                121.60   [1, 6, 22, 46, 48, 60, 86, 91, 100, 100]
3        Corresponding LSE function  769.90   [12, 60, 67, 87, 91, 92, 100, 100, 100, 100]

Table 3. Superior performance of the LSE approach to the CPI approach.


Example  Objective function          MSE      Knot points
1        LSE function                152.20   [33, 35, 39, 52, 55, 80, 90, 100, 100, 100]
1        Corresponding CPI function  243.10   [33, 44, 58, 63, 69, 78, 86, 90, 100, 100]
2        LSE function                147.50   [1, 9, 12, 28, 45, 68, 74, 82, 92, 100]
2        Corresponding CPI function  194.10   [1, 9, 12, 28, 34, 45, 69, 85, 92, 100]
3        LSE function                64.40    [12, 20, 39, 45, 55, 76, 77, 100, 100, 100]
3        Corresponding CPI function  86.80    [12, 34, 45, 54, 69, 73, 82, 83, 86, 100]


Table 4. Comparison of the proposed methods. 


Example  Objective function  MSE      Knot points
1        CPI function        79.90    [8, 20, 27, 36, 68, 71, 85, 87, 95, 100]
1        LSE function        0.1      [17, 25, 35, 43, 55, 60, 70, 87, 91, 100]
1        CPI+LSE             0        [17, 25, 35, 43, 55, 60, 70, 87, 90, 100]
2        CPI function        41.60    [15, 40, 41, 44, 59, 69, 71, 93, 94, 100]
2        LSE function        0.3      [17, 25, 35, 43, 56, 61, 70, 87, 91, 100]
2        CPI+LSE             0        [17, 25, 35, 43, 55, 60, 70, 87, 90, 100]
3        CPI function        26.70    [9, 25, 29, 39, 54, 58, 71, 79, 81, 100]
3        LSE function        0.1      [17, 25, 35, 43, 55, 60, 70, 86, 90, 100]
3        CPI+LSE             0        [17, 25, 35, 43, 55, 60, 70, 87, 90, 100]
4        CPI function        970.90   [12, 68, 84, 89, 91, 92, 97, 100, 100, 100]
4        LSE function        27       [17, 25, 34, 38, 55, 60, 70, 87, 91, 100]
4        CPI+LSE             0.1      [17, 25, 35, 43, 55, 60, 70, 87, 91, 100]
5        CPI function        67.50    [31, 33, 36, 40, 58, 59, 61, 70, 85, 100]
5        LSE function        0.1      [17, 25, 35, 43, 55, 60, 70, 87, 91, 100]
5        CPI+LSE             0        [17, 25, 35, 43, 55, 60, 70, 87, 90, 100]


Table 5. The effect of the number of knots on total square error (MSE values). 


Number of Knots   CPI Function   LSE Function   Combining the Two Functions
2                 297.00         249.60         220.40
3                 256.20         266.70         200.60
4                 184.30         189.00         166.20
5                 172.40         184.0          92.30
6                 162.80         147.70         86.10
7                 102.10         81.60          68.50
8                 97.30          60.90          59.70
9                 45.90          48.50          36.50
10                26.70          0.1            0


3.1 | Sensitivity Analysis of the Number of Knots 


First, we use simulations to examine the sensitivity of our method to the number of knots. Using the
numerical example and the parameters mentioned in the previous section, the results of Table 5 indicate
that the combined method performs better than the other methods. As expected, the performance of all the
proposed methods improves as the number of knots increases. In addition, as shown in Fig. 4,
the larger the assumed number of knots, the smaller the difference between the LSE and CPI methods. Note
that the proposed method can be used when the number of knots is more than one.
Fig. 4. Comparison of the performance of the proposed methods (series: combining the two functions, the
least square error function, and the Process Capability Index function).
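
The effect of the number of knots on the fitting error can be reproduced in a small experiment. The sketch below is not the paper's simulation: it assumes synthetic data from a noisy sine curve, independent straight-line fits on each interval, and hand-picked nested knot sets, and the function name `segment_mse` is hypothetical.

```python
import numpy as np

def segment_mse(x, y, knots):
    """MSE of independent straight-line fits on the intervals cut by knots."""
    edges = np.concatenate(([-np.inf], np.sort(knots), [np.inf]))
    # Index of the interval [edges[i], edges[i+1]) that contains each x.
    idx = np.searchsorted(edges, x, side="right") - 1
    sse = 0.0
    for k in np.unique(idx):
        m = idx == k
        design = np.column_stack([np.ones(m.sum()), x[m]])
        coef, *_ = np.linalg.lstsq(design, y[m], rcond=None)
        sse += np.sum((y[m] - design @ coef) ** 2)
    return sse / len(x)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 100.0, 201)
y = np.sin(x / 8.0) * 20.0 + rng.normal(0.0, 1.0, x.size)  # synthetic data

# Nested knot sets: each contains the previous one, so the fit cannot get worse.
knot_sets = [[], [50], [25, 50, 75], [12.5, 25, 37.5, 50, 62.5, 75, 87.5]]
mses = [segment_mse(x, y, np.array(k, dtype=float)) for k in knot_sets]
for k, m in zip(knot_sets, mses):
    print(len(k), round(float(m), 2))
```

Because every finer knot set contains the coarser one, the piecewise model can always reproduce the coarser fit, so the MSE is non-increasing in this setup. With knot sets that are not nested, the decrease is typical but not guaranteed.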


Real-world numerical data are usually difficult to analyze. To this end, the idea of applying a GA to
splines is developed. As shown in Tables 2-4, the simulation results confirm that the proposed GA
substantially helps spline curve fitting. Using the proposed algorithm, a series of unique polynomials is
fitted between each of the data points, with the stipulation that the curve obtained be continuous and
appear smooth. There are many cases in which splines play an important role in data analysis. Wherever
a spline is used, the proposed algorithm can be attractive and help improve the fit. Hence, the proposed
method can be used in various fields from a managerial point of view.


4 | A Comparative Study 


Due to the lack of an analytic expression for optimal knot locations, different methodologies have been
proposed in the specialized literature for the selection and optimization of knot vectors. Some
fast deterministic methods are employed. However, in the case of complex point clouds, their results are far
from the optimum. Alternatively, metaheuristic methods, especially the GA, yield knot
vectors that are very close to the optimum but converge slowly and are, therefore, time- and
computing power-consuming. Furthermore, the performance of these algorithms is seriously affected
by the occurrence of data gaps. Recently, Bureick et al. [42] proposed an elitist GA to solve the knot
adjustment problem for B-Spline curves despite the possible occurrence of data gaps. It is worth
mentioning that we focused on the determination of knot location and knot vector size. In effect, we
try to realize model selection and knot vector determination simultaneously. By contrast, Bureick et al.
[42] focused solely on knot vector determination.


To evaluate the efficiency and capability of the proposed algorithm, our method and the elitist GA are
applied to a test function introduced in Yoshimoto et al. [29] according to Eq. (4). The chosen parameters
for both algorithms to obtain the subsequent results are gathered mainly from the literature.


Fig. 5. Comparison results for test function with given parameters (series: test function, proposed
method, existing method).
Generally, the simulation results show that the proposed GA is a simple method and performs
slightly better than the comparative method. However, the elitist GA solves the knot adjustment problem
faster than the proposed approach.


5 | Application 


The estimation of the locations of the knots in spline functions can be used in different applications.
Recently, Toutounji and Durstewitz [43] utilized this concept and detected multiple change points using
adaptive regression splines with application to neural recordings. Another application is in SPC. In
this respect, control charts are one of the most important tools in monitoring a process, but most control
charts signal a change in the process with a delay. The actual time at which the process changes is called
a change point. To save time and cost, its estimation is an important issue in SPC. In this regard, if the
knots are taken as the change points, the multiple change points in monitoring the quality
characteristic can be estimated through estimation of the number and location of the knots. Hence,
multiple change point estimation can be considered an interesting application. So far, some studies have
estimated multiple change points under restrictive assumptions such as a fixed
number of change points.
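
To make the link between knots and change points concrete, the following sketch locates a single change point by exhaustive search. It is a deliberately simplified stand-in, assuming one mean shift in a noise-free series rather than the linear profiles and multiple change points treated in this paper; the function name `single_change_point` and the data are hypothetical.

```python
import numpy as np

def single_change_point(y):
    """Return the split index t (1 <= t < n) that minimises the total SSE
    when y[:t] and y[t:] are each summarised by their own mean."""
    y = np.asarray(y, dtype=float)
    best_t, best_sse = None, np.inf
    for t in range(1, len(y)):
        left, right = y[:t], y[t:]
        sse = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if sse < best_sse:
            best_t, best_sse = t, sse
    return best_t

# Hypothetical process: level 0 for 30 samples, then level 5 for 30 samples.
series = [0.0] * 30 + [5.0] * 30
print(single_change_point(series))  # 30
```

The split index plays the role of a single knot. Extending the search to several splits at once quickly becomes expensive, which is the motivation for a heuristic search such as a GA.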


In this section, we specifically illustrate the implementation of our method in the estimation of multiple
change points. Note that the knot vector obtained from the proposed method is reported as the
estimated change point vector. To validate the algorithm in this example, we consider the multiple change
point vector within the given ranges as the determined change point vector in Table 6. As can be seen,
different functions of the simple linear regression model are assumed in this table. Simulation results
are calculated using the proposed algorithm, and the values pm=0.2 and pc=0.65 are obtained by
adjusting the algorithm parameters. The MSE values for the CPI, the least squared error, and the combination
of the two functions are equal to 72.40, 39.20, and 37.6, respectively. The change points are given in Table
6 to show the applicability of the proposed method in this real example. As shown in this table, the
estimations of the change points obtained by the proposed method are close to the determined change points.


Table 6. A real example of the multiple change points. 


Simple Linear Profile Functions within Different Ranges of the Independent Variable

y1 = 3 + x
y2 = 8 + 6x
...
y10 = 3 + 9x

Determined change points
[17, 25, 35, 43, 55, 60, 70, 87, 90, 100]

The estimations of change points by the proposed method
[7, 19, 31, 47, 56, 62, 69, 81, 91, 100]


6 | Conclusion 


Among nonparametric methods, the spline model is one of the regression models with considerable statistical
interpretation. However, this method requires the identification of knots. In this regard, we proposed a
regression spline based on a GA that is intuitively appealing and simple. The proposed algorithm
estimates the number and the location of the knots simultaneously. It is able to handle data with
changeable behaviors on certain sub-intervals. The proposed GA is based on the LSE considering the CPI.
The performance of the proposed method was evaluated using several numerical examples. Also, it was shown
that the larger the assumed number of knots, the smaller the difference between the LSE and CPI methods.
Note that this algorithm can be used for Normal-type observations with normal residuals. In the end, the
applicability of the proposed algorithm was shown using a real example.


For further research, functions and methods defined in spline regression may be applied in the
objective function to minimize the total error. The dependence of the proposed method on the residual
distribution may also be removed by applying other nonparametric regression methods. In addition, a new GA
can be provided for non-Normal data.